

Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts

Neural Information Processing Systems

By transferring both features and gradients across layers, the shortcut connections introduced by ResNets allow us to effectively train very deep neural networks of up to hundreds of layers. However, the additional computation costs induced by these shortcuts are often overlooked. For example, during online inference the shortcuts in ResNet-50 account for about 40 percent of the total memory used for feature maps, because features from preceding layers cannot be released until the subsequent computation is completed. In this work, for the first time, we consider training CNN models with shortcuts and deploying them without. In particular, we propose a novel joint-training framework that trains a plain CNN by leveraging the gradients of its ResNet counterpart.
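To illustrate the memory cost the abstract refers to, here is a minimal sketch (not the authors' code) contrasting a residual block with a plain block; `conv` is a hypothetical stand-in for a conv-BN-ReLU stage:

```python
import numpy as np

def conv(x):
    # Hypothetical stand-in for a conv-BN-ReLU stage.
    return x * 0.5 + 1.0

def residual_block(x):
    # The input x must be kept in memory until the addition below
    # completes; this retained feature map is the extra inference
    # cost that removing shortcuts avoids.
    return conv(conv(x)) + x

def plain_block(x):
    # No skip path: x can be freed as soon as the first conv
    # has consumed it.
    y = conv(x)
    return conv(y)

x = np.ones((4,))
print(residual_block(x))  # plain output plus the retained input
print(plain_block(x))
```

The two blocks differ only by the final addition, but that addition forces the runtime to hold the block's input alongside all intermediate activations, which is where the reported ~40 percent feature-map memory overhead comes from.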


Review for NeurIPS paper: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts

Neural Information Processing Systems

Weaknesses: * There are numerous approaches to reducing a ConvNet's memory footprint and computational cost at inference time, including but not limited to channel pruning, dynamic computation graphs, and model distillation. Why is removing the shortcut connections the best way to achieve this goal? The baselines considered in Tables 3 and 4 are rather lacking. For example, how does the proposed method compare to: 1. A pruning method that reduces the channel counts of ResNet-50 to match the memory footprint and FLOPs of plain-CNN 50. What would be the drop in accuracy?


Review for NeurIPS paper: Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts

Neural Information Processing Systems

The new training scheme follows the teacher-student paradigm to obtain results comparable to those of a ResNet model, but without residual connections (shortcuts). Results are on par with the SOTA and the approach is very interesting, although not necessarily very novel in principle (I encourage the authors to make this much clearer in the final text). All reviewers agree that this is a good contribution and that the rebuttal was helpful in reaching the final conclusion.

